[Name of Document] SPECIFICATION

[Title of the Invention] DATA TRANSMISSION SYSTEM, DATA
TRANSMITTING APPARATUS AND METHOD, AND SCENE
DESCRIPTION PROCESSING UNIT AND METHOD

[Claims]

[Claim 1] A data transmission system having a 
transmitting apparatus that transmits scene description data 
which describes the structures of one or more signals in a 
scene, and a receiving apparatus that constructs the scene 
according to the scene description data, wherein: 

said transmitting apparatus has a scene description 
processing means that outputs the scene description data 
which conforms to the state of a transmission line and/or a 
request issued from said receiving apparatus. 

[Claim 2] A data transmission system according to Claim 1 
further comprising a memory means in which a plurality of 
predefined scene description data is stored, wherein: 

said scene description processing means outputs a scene 
description selectively from among the plurality of scene 
descriptions stored in said memory means. 

[Claim 3] A data transmission system according to Claim 1 
further comprising a memory means in which a plurality of 
predefined scene descriptions is stored, wherein: 

said scene description processing means converts
predefined scene description data read from said memory
means into other scene description data, and outputs the
resultant scene description data.

[Claim 4] A data transmission system according to Claim 1,
wherein said scene description processing means encodes a
scene description and transfers the resultant scene
description.

[Claim 5] A data transmission system according to Claim 1, 
wherein:

said transmitting apparatus includes a signal 
processing means that outputs one or more signals, which 
conform to the state of a transmission line and/or a request 
issued from said receiving apparatus, as one or more signals 
to be used to construct a scene; and 

said scene description processing means outputs the 
scene description data that conforms to a transmission rate 
for a signal transferred from said signal processing means 
and/or quality. 

[Claim 6] A data transmission system according to Claim 1, 
wherein:

said transmitting apparatus includes a signal 
processing means that transfers one or more signals, which 
conform to the state of a transmission line and/or a request 
issued from said receiving apparatus, as one or more signals 
to be used to construct a scene; and 

said scene description processing means outputs the
scene description data that includes information necessary
for said receiving apparatus to decode the signals output 
from said signal processing means. 

[Claim 7] A data transmission system according to Claim 1, 
wherein: 

said transmitting apparatus includes a signal 
processing means that transfers one or more signals, which 
conform to the state of a transmission line and/or a request 
issued from said receiving apparatus, as one or more signals 
to be used to construct a scene; and 

said scene description processing means outputs the 
scene description data that specifies whether the signals to 
be used to construct the scene are used or not.

[Claim 8] A data transmission system according to Claim 1, 
wherein said scene description processing means outputs a 
scene description whose complexity conforms to the state of 
a transmission line and/or a request issued from said 
receiving apparatus.

[Claim 9] A data transmission system according to Claim 8, 
wherein said scene description processing means outputs a 
scene description, with which a first part scene within a 
scene is replaced with a second part scene whose complexity 
is different from the complexity of the first part scene, in 
conformity with the state of a transmission line and/or a 
request issued from said receiving apparatus.

[Claim 10] A data transmission system according to Claim
8, wherein said scene description processing means outputs 
the scene description data, with which a part scene within a 
scene is removed or a new part scene is added to the scene, 
in conformity with the state of a transmission line and/or a 
request issued from said receiving apparatus. 

[Claim 11] A data transmission system according to Claim 
8, wherein said scene description processing means modifies 
a quantization step, at which the scene description data is 
encoded, in conformity with the state of a transmission line 
and/or a request issued from said receiving apparatus. 

[Claim 12] A data transmission system according to Claim 
1, wherein said scene description processing means divides 
the scene description data into a plurality of decoding 
units in conformity with the state of a transmission line 
and/or a request issued from said receiving apparatus, and 
then outputs the resultant scene description data. 

[Claim 13] A data transmission system according to Claim 
12, wherein said scene description processing means adjusts 
a time interval between time instants at which said 
receiving apparatus decodes each of the plurality of 
decoding units into which the scene description data is 
divided. 

[Claim 14] A data transmitting method for transmitting 
scene description data that describes the structures of one
or more signals in a scene, and constructing the scene
according to the scene description data, wherein:

the scene description data that conforms to the state
of a transmission line and/or a request issued from a
receiving side is transmitted.

[Claim 15] A data transmitting method according to Claim 
14, wherein:

a plurality of predefined scene description data is
stored; and

the scene description data is selectively read from
among the plurality of stored scene descriptions, and then
transmitted.

[Claim 16] A data transmission method according to Claim 
14, wherein: 

predefined scene description data are stored; and

any of the predefined scene descriptions that are
stored is read, converted into other scene description
data, and then transmitted.

[Claim 17] A data transmission method according to Claim 
14, wherein a scene description is encoded and transmitted.

[Claim 18] A data transmission method according to Claim 
14, wherein: 

one or more signals that conform to the state of a 
transmission line and/or a request issued from a receiving
side are transmitted as one or more signals to be used to
construct a scene; and

the scene description data that conforms to a 
transmission rate at which the signals are transmitted in 
compliance with the state of a transmission line and/or a 
request issued from a receiving side, and/or quality is 
transmitted. 

[Claim 19] A data transmitting method according to Claim 
14, wherein: 

one or more signals that conform to the state of a 
transmission line and/or a request issued from a receiving 
side are transmitted as one or more signals to be used to 
construct a scene; and 

the scene description data that includes information 
necessary for a receiving side to restore the signals 
transmitted in conformity with the state of the transmission 
line and/or the request issued from the receiving side is 
transmitted. 

[Claim 20] A data transmission method according to Claim 
14, wherein: 

one or more signals that conform to the state of a 
transmission line and/or a request issued from a receiving 
side are transmitted as one or more signals to be used to 
construct a scene; and 

the scene description data that specifies whether the 
signals to be used to construct a scene are used or not is
transmitted.

[Claim 21] A data transmission method according to Claim 
14, wherein the scene description data whose complexity 
conforms to the state of a transmission line and/or a 
request issued from a receiving side is transmitted. 

[Claim 22] A data transmission method according to Claim 
21, wherein the scene description data with which a first 
part scene within a scene is replaced with a second part 
scene whose complexity is different from the complexity of 
the first part scene is transmitted in conformity with the 
state of a transmission line and/or a request issued from a 
receiving side. 

[Claim 23] A data transmitting method according to Claim 
21, wherein the scene description data with which a part
scene within a scene is removed or a new part scene is added
to the scene is transmitted in conformity with the state of
a transmission line and/or a request issued from a receiving
side.

[Claim 24] A data transmitting method according to Claim 
21, wherein a quantization step at which the scene 
description data is encoded is modified in conformity with 
the state of a transmission line and/or a request issued 
from a receiving side. 

[Claim 25] A data transmitting method according to Claim 
14, wherein the scene description data is divided into a
plurality of decoding units in conformity with the state of
a transmission line and/or a request issued from a receiving 
side, and then transmitted. 

[Claim 26] A data transmitting method according to Claim 
25, wherein a time interval between time instants at which a 
receiving side decodes each of the plurality of decoding 
units into which the scene description data is divided is 
adjusted. 

[Claim 27] A data transmitting apparatus for transmitting 
scene description data that describes the structures of one 
or more signals in a scene, comprising: 

a scene description processing means for outputting the 
scene description data that conforms to the state of a 
transmission line and/or a request issued from a receiving 
side.

[Claim 28] A data transmitting apparatus according to 
Claim 27, further comprising: 

a memory means in which a plurality of predefined scene 
description data is stored, wherein: 

said scene description processing means selectively 
reads the scene description data from among the plurality of 
scene descriptions stored in said memory means, and outputs 
the selected scene description data. 

[Claim 29] A data transmitting apparatus according to 
Claim 27, further comprising:

a memory means in which predefined scene description
data are stored, wherein:

said scene description processing means converts
predefined scene description data read from said memory
means into other scene description data, and outputs the
resultant scene description data.

[Claim 30] A data transmitting apparatus according to 
Claim 27, wherein said scene description processing means 
encodes the scene description data and transmits the 
resultant scene description data. 

[Claim 31] A data transmitting apparatus according to 
Claim 27, further comprising a signal processing means that 
outputs one or more signals, which conform to the state of a 
transmission line and/or a request issued from a receiving
side, as one or more signals to be used to construct a scene,
wherein:

said scene description processing means outputs the 
scene description data that conforms to a transmission rate 
for the signals output from said signal processing means 
and/or quality. 

[Claim 32] A data transmitting apparatus according to 
Claim 27, further comprising a signal processing means that 
outputs one or more signals, which conform to the state of a 
transmission line and/or a request issued from a receiving 
side, as one or more signals to be used to construct a scene,
wherein:

said scene description processing means outputs the 
scene description data that includes information necessary 
for a receiving side to decode the signals transferred from 
said signal processing means. 

[Claim 33] A data transmitting apparatus according to 
Claim 27, further comprising a signal processing means that 
outputs one or more signals, which conform to the state of a 
transmission line and/or a request issued from a receiving 
side, as one or more signals to be used to construct a scene, 
wherein: 

said scene description processing means outputs the 
scene description data that specifies whether the signals to 
be used to construct a scene are used or not.

[Claim 34] A data transmitting apparatus according to 
Claim 27, wherein said scene description processing means 
outputs the scene description data whose complexity conforms 
to the state of a transmission line and/or a request issued 
from a receiving side. 

[Claim 35] A data transmitting apparatus according to 
Claim 34, wherein said scene description processing means 
outputs the scene description data, with which a first part 
scene within a scene is replaced with a second part scene 
whose complexity is different from the complexity of the 
first part scene, in conformity with the state of a
transmission line and/or a request issued from a receiving
side.

[Claim 36] A data transmitting apparatus according to 
Claim 34, wherein said scene description processing means 
outputs the scene description data, with which a part scene 
within a scene is removed or a new part scene is added to 
the scene, in conformity with the state of a transmission 
line and/or a request issued from a receiving side. 

[Claim 37] A data transmitting apparatus according to 
Claim 34, wherein said scene description processing means 
modifies a quantization step, at which the scene description 
data is encoded, in conformity with the state of a 
transmission line and/or a request issued from a receiving 
side. 

[Claim 38] A data transmitting apparatus according to 
Claim 27, wherein said scene description processing means 
divides the scene description data into a plurality of 
decoding units in conformity with the state of a 
transmission line and/or a request issued from a receiving 
side. 

[Claim 39] A data transmitting apparatus according to 
Claim 38, wherein said scene description processing means 
adjusts a time interval between time instants at which a 
receiving side decodes each of the plurality of decoding 
units into which the scene description data is divided. 



[Claim 40] A data transmitting method for transmitting 
scene description data that describes the structures of one 
or more signals in a scene, wherein: 

a scene description that conforms to the state of a 
transmission line and/or a request issued from a receiving 
side is transmitted. 

[Claim 41] A data transmitting method according to Claim 
40, wherein a plurality of predefined scene description data 
is stored, and the scene description data selectively read 
from among the plurality of scene description data that are 
stored is transmitted. 

[Claim 42] A data transmitting method according to Claim 
40, wherein predefined scene description data are stored,
and any of the predefined scene description data that are
stored is read, converted into other scene description data,
and then transmitted.

[Claim 43] A data transmitting method according to Claim 
40, wherein the scene description data is encoded and 
transmitted. 

[Claim 44] A data transmitting method according to Claim 
40, wherein: 

one or more signals that conform to the state of a 
transmission line and/or a request issued from a receiving 
side are transmitted as one or more signals to be used to 
construct a scene; and

the scene description data that conforms to a
transmission rate at which the signals are transmitted in
conformity with the state of a transmission line and/or a
request issued from a receiving side, and/or quality is
transmitted.

[Claim 45] A data transmitting method according to Claim 
40, wherein:

one or more signals that conform to the state of a 
transmission line and/or a request issued from a receiving 
side are transmitted as one or more signals to be used to 
construct a scene; and 

the scene description data that includes information 
necessary for a receiving side to decode the signals 
transmitted in conformity with the state of a transmission 
line and/or a request issued from the receiving side is
transmitted.

[Claim 46] A data transmitting method according to Claim 
40, wherein: 

one or more signals that conform to the state of a 
transmission line and/or a request issued from a receiving 
side are transmitted as one or more signals to be used to 
construct a scene; and 

the scene description data that specifies whether the 
signals to be used to construct a scene are used or not is 
transmitted. 

[Claim 47] A data transmitting method according to Claim
40, wherein the scene description data whose complexity
conforms to the state of a transmission line and/or a 
request issued from a receiving side is transmitted. 

[Claim 48] A data transmitting method according to Claim 
47, wherein the scene description data, with which a first 
part scene within a scene is replaced with a second part 
scene whose complexity is different from the complexity of 
the first part scene, is transmitted in conformity with the
state of a transmission line and/or a request issued from a 
receiving side. 

[Claim 49] A data transmitting method according to Claim 
47, wherein the scene description data, with which a part 
scene within a scene is removed or a new part scene is added 
to the scene, is output in conformity with the state of a 
transmission line and/or a request issued from a receiving 
side.

[Claim 50] A data transmitting method according to Claim 
47, wherein a quantization step at which the scene 
description data is encoded is modified in conformity with 
the state of a transmission line and/or a request issued 
from a receiving side. 

[Claim 51] A data transmitting method according to Claim 
40, wherein the scene description data is divided into a 
plurality of decoding units in conformity with the state of 
a transmission line and/or a request issued from a receiving
side.

[Claim 52] A data transmitting method according to Claim 
51, wherein a time interval between time instants at which a 
receiving side decodes each of the plurality of decoding 
units into which the scene description data is divided is 
adjusted. 

[Claim 53] A scene description processing unit for 
processing scene description data that describes the 
structures of one or more signals in a scene, wherein: 

when the scene description data is transmitted over a 
transmission line, the scene description data that conforms 
to the state of the transmission line and/or a request 
issued from a receiving side is output. 

[Claim 54] A scene description processing unit according 
to Claim 53, wherein the scene description data is 
selectively read from among a plurality of predefined scene 
descriptions, and then output. 

[Claim 55] A scene description processing unit according 
to Claim 53, wherein a predefined scene description data is 
converted into another scene description data, and then 
output.

[Claim 56] A scene description processing unit according 
to Claim 53, wherein the scene description data is decoded 
and output.

[Claim 57] A scene description processing unit according
to Claim 53, wherein the scene description data that
conforms to a transmission rate, at which the signals are
output in conformity with the state of a transmission line
and/or a request issued from a receiving side as one or more
signals to be used to construct a scene, and/or quality is
output.

[Claim 58] A scene description processing unit according 
to Claim 53, wherein the scene description data that 
includes information necessary for a receiving side to 
decode the signals, which are output in conformity with the
state of a transmission line and/or a request issued from 
the receiving side as one or more signals to be used to 
construct a scene, is output. 

[Claim 59] A scene description processing unit according to
Claim 53, wherein the scene description data that specifies 
whether the signals that are output in conformity with the 
state of a transmission line and/or a request issued from a 
receiving side as one or more signals to be used to 
construct a scene are used or not is output. 

[Claim 60] A scene description processing unit according 
to Claim 53, wherein the scene description data whose 
complexity conforms to the state of a transmission line 
and/or a request issued from a receiving side is output. 

[Claim 61] A scene description processing unit according 
to Claim 60, wherein the scene description data with which a
first part scene within a scene is replaced with a second
part scene whose complexity is different from the complexity
of the first part scene is output in conformity with the
state of a transmission line and/or a request issued from a
receiving side.

[Claim 62] A scene description processing unit according 
to Claim 60, wherein the scene description data with which a 
part scene within a scene is removed or a new part scene is 
added to the scene is output in conformity with the state of 
a transmission line and/or a request issued from a receiving 
side. 

[Claim 63] A scene description processing unit according 
to Claim 60, wherein a quantization step at which the scene 
description data is encoded is modified in conformity with 
the state of a transmission line and/or a request issued 
from a receiving side. 

[Claim 64] A scene description processing unit according 
to Claim 53, wherein the scene description data is divided 
into a plurality of decoding units in conformity with the 
state of a transmission line and/or a request issued from a 
receiving side, and then output.

[Claim 65] A scene description processing unit according 
to Claim 64, wherein a time interval between time instants 
at which a receiving side decodes each of the plurality of 
decoding units into which the scene description data is
divided is adjusted.

[Claim 66] A scene description processing method for 
processing scene description data that describes the 
structures of one or more signals in a scene, wherein: 

when a scene description is transmitted over a 
transmission line, the scene description data that conforms 
to the state of the transmission line and/or a request 
issued from a receiving side is output. 

[Claim 67] A scene description processing method 
according to Claim 66, wherein the scene description data is 
selectively read from among a plurality of predefined scene 
descriptions, and then output. 

[Claim 68] A scene description processing method 
according to Claim 66, wherein a predefined scene 
description data is converted into another scene description 
data, and then output. 

[Claim 69] A scene description processing method 
according to Claim 66, wherein the scene description data is 
encoded and then output.

[Claim 70] A scene description processing method 
according to Claim 66, wherein the scene description data 
that conforms to a transmission rate, at which the signals 
are output in conformity with the state of a transmission 
line and/or a request issued from a receiving side as one
or more signals to be used to construct a scene, and/or
quality is output.

[Claim 71] A scene description processing method 
according to Claim 66, wherein the scene description data 
that includes information necessary for a receiving side to 
decode the signals that are output in conformity with the 
state of a transmission line and/or a request issued from 
the receiving side as one or more signals to be used to 
construct a scene is transferred. 

[Claim 72] A scene description processing method 
according to Claim 66, wherein the scene description data 
that specifies whether the signals which are output in 
conformity with the state of a transmission line and/or a 
request issued from a receiving side as one or more signals 
to be used to construct a scene are used or not is output. 

[Claim 73] A scene description processing method 
according to Claim 66, wherein the scene description data 
whose complexity conforms to the state of a transmission 
line and/or a request issued from a receiving side is output. 

[Claim 74] A scene description processing method 
according to Claim 73, wherein the scene description data 
with which a first part scene within a scene is replaced 
with a second part scene whose complexity is different from 
the complexity of the first part scene is output in 
conformity with the state of a transmission line and/or a 
request issued from a receiving side.

[Claim 75] A scene description processing method
according to Claim 73, wherein the scene description data 
with which a part scene within a scene is removed or a new 
part scene is added to the scene is output in conformity 
with the state of a transmission line and/or a request 
issued from a receiving side. 

[Claim 76] A scene description processing method 
according to Claim 73, wherein a quantization step at which 
the scene description data is encoded is modified in 
conformity with the state of a transmission line and/or a 
request issued from a receiving side. 

[Claim 77] A scene description processing method 
according to Claim 66, wherein the scene description data is 
divided into a plurality of decoding units in conformity 
with the state of a transmission line and/or a request 
issued from a receiving side. 

[Claim 78] A scene description processing method 
according to Claim 77, wherein a time interval between time
instants at which a receiving side decodes each of the 
plurality of decoding units into which the scene description 
data is divided is adjusted.

[Detailed Description of the Invention]
[0001] 

[Technical Field of the Invention] 

The present invention relates to a data transmission
system, a data transmitting apparatus and method, and a
scene description processing unit and method, wherein scene 
description data based on which a scene is constructed using 
multimedia data that includes a still image signal, a motion 
picture signal, an acoustic signal, text data, and graphic 
data is distributed over a network, received by receiving 
terminals, and decoded for display. 
[0002] 

[Description of the Related Art] 

Fig. 20 shows the configuration of a conventional data
distribution system in which a motion picture signal, an 
acoustic signal, and others are transmitted over a 
transmission medium, received by receiving terminals, and 
decoded for display. Hereinafter, the motion picture signal,
acoustic signal, and others that are encoded in compliance
with ISO/IEC 13818 (so-called MPEG2) shall be
referred to as elementary streams (ESs).
[0003] 

Referring to Fig. 20, an ES processing unit 103 
included in a server 100 selects any of ESs stored in 
advance in a memory 104 or receives a baseband image signal 
or acoustic signal that is not shown, and encodes the ES or 
signal. At this time, a plurality of ESs may be selected. 
A transmission control unit 105 included in the server 100 
multiplexes a plurality of ESs if necessary, encodes the
resultant signal according to the protocol used for
transmission over a transmission medium 107, and
transmits the signal to a receiving terminal 108.
[0004] 

A reception control unit 109 included in the receiving 
terminal 108 decodes the signal, which is transmitted over 
the transmission medium 107, according to the protocol. If 
necessary, the reception control unit 109 separates the 
multiplexed ESs, and hands the ESs to associated ES decoding 
units 112. The ES decoding unit 112 decodes an ES to 
restore a motion picture signal, an acoustic signal, or the 
like, and sends the signal to a display sounding unit 113 
that includes a television monitor and a loudspeaker. 
Consequently, images are displayed on the television monitor 
and sounds are radiated through the loudspeaker. 

[0005] 

The server 100 corresponds to a transmission system 
installed at a broadcasting station that provides a 
broadcasting service, or an Internet server or a home server 
that gives access to the Internet. Moreover, the receiving 
terminal 108 corresponds to a receiver for receiving a
broadcast signal or a personal computer. 

[0006] 

There is a drawback that a change in the bandwidth
offered by a transmission line (transmission medium 107) for
transmitting ESs, or a traffic-jammed state on the
transmission line, leads to a delay in data transmission or a
loss of transmitted data.
[0007] 

In order to overcome the drawback, the data 
distribution system shown in Fig. 20 performs actions 
described below. 

[0008] 

The server 100 (for example, the transmission control
unit 105) assigns an (encoded) serial number to each packet
in which data is transmitted over the transmission
line. The reception control unit 109 in the receiving
terminal 108 monitors the packets received over the
transmission line to see whether any assigned (encoded) serial
number is missing, and thus detects a loss of data (data
loss rate). Alternatively, the server 100 (for example, the
transmission control unit 105) appends (encoded) time
instant information to data to be transmitted over the
transmission line. The reception control unit 109 in the
receiving terminal 108 monitors the (encoded) time instant
information appended to the data received over the
transmission line, and detects a delay in
transmission from the time instant information. The
reception control unit 109 in the receiving terminal 108
thus detects a data loss rate or a transmission delay on the
transmission line, and transmits (reports) the
detected information to a data-transmitted state detector
106 included in the server 100.
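
The receiver-side monitoring described above can be sketched as follows. This is only an illustrative sketch, not the implementation of the reception control unit 109: the Packet record and the functions loss_rate and mean_delay are hypothetical names, and the sender and receiver clocks are assumed to be synchronized so that the time instant information can be compared directly with the arrival time.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Packet:
    serial: int         # serial number assigned by the sender (hypothetical field)
    sent_at: float      # time instant information appended by the sender, seconds
    received_at: float  # arrival time observed by the receiver, seconds


def loss_rate(packets: List[Packet]) -> float:
    """Fraction of serial numbers missing between the lowest and highest seen."""
    if not packets:
        return 0.0
    serials = {p.serial for p in packets}
    expected = max(serials) - min(serials) + 1
    return (expected - len(serials)) / expected


def mean_delay(packets: List[Packet]) -> float:
    """Average transmission delay, assuming synchronized clocks."""
    if not packets:
        return 0.0
    return sum(p.received_at - p.sent_at for p in packets) / len(packets)


# Serial number 3 never arrives, so one of five expected packets is lost.
received = [
    Packet(1, 0.0, 0.5),
    Packet(2, 1.0, 1.5),
    Packet(4, 3.0, 3.5),
    Packet(5, 4.0, 4.5),
]
report = {"loss_rate": loss_rate(received), "delay": mean_delay(received)}
```

In this sketch the report dictionary stands in for the information that the receiving terminal reports back to the server.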
[0009] 

The data-transmitted state detector 106 in the server
100 receives the data loss rate or the transmission delay on
the transmission line from the reception control
unit 109 in the receiving terminal 108, and detects the
bandwidth offered by the transmission line or a
traffic-jammed state occurring thereon. If the data loss rate is
high or the transmission delay has increased, the
data-transmitted state detector 106 judges
that the transmission line is jammed. Moreover, if a
transmission line of a bandwidth reservation type is
employed, the data-transmitted state detector 106 can detect
the available bandwidth (bandwidth offered by the
transmission line) usable by the server 100. When a
transmission medium affected by weather conditions, such as
radio waves, is employed, a user may designate a bandwidth in
advance according to the weather conditions. The
information on the data-transmitted state detected by the
data-transmitted state detector 106 is sent to a conversion
control unit 101.
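
The judgment made by the data-transmitted state detector 106 can be illustrated with a minimal sketch. The function name is_jammed and both threshold values are assumptions introduced only for illustration; the specification does not state how large a loss or how great an increase in delay is required.

```python
def is_jammed(loss_rate: float,
              delay: float,
              baseline_delay: float,
              max_loss: float = 0.05,          # hypothetical loss-rate threshold
              max_delay_growth: float = 1.5    # hypothetical delay growth factor
              ) -> bool:
    """Judge the line jammed if loss is high or delay has grown past the baseline."""
    loss_is_large = loss_rate > max_loss
    delay_has_increased = delay > baseline_delay * max_delay_growth
    return loss_is_large or delay_has_increased


# A 10% loss rate exceeds the assumed 5% threshold even though delay is normal.
jammed_by_loss = is_jammed(loss_rate=0.10, delay=0.10, baseline_delay=0.10)
# Delay has doubled relative to the baseline while loss stays low.
jammed_by_delay = is_jammed(loss_rate=0.01, delay=0.20, baseline_delay=0.10)
# Neither condition holds.
not_jammed = is_jammed(loss_rate=0.01, delay=0.11, baseline_delay=0.10)
```

Either condition alone is sufficient in this sketch, which mirrors the two separate judgments described in the paragraph above.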

[0010]

The conversion control unit 101 exercises control
according to the detected bandwidth offered
by the transmission line or the traffic-jammed state on the
transmission line so that the ES processing unit 103
switches between ESs that are transmitted at different bit rates.
Alternatively, when the ES processing unit 103 encodes an ES in
compliance with ISO/IEC 13818 (so-called MPEG2), the
conversion control unit 101 adjusts the encoding bit rate.
Specifically, if it is detected that the transmission line is
jammed, the ES processing unit 103 transfers an ES that is
transmitted at a low bit rate. Consequently, a delay in
data transmission can be avoided.

[0011] 

Moreover, a large, unspecified number of receiving 
terminals 108 may be connected to the server 100, and the 
specifications of the receiving terminals 108 may not be 
uniform. The server 100 may therefore have to transmit an 
ES to receiving terminals whose processing abilities differ 
from one another. In this system configuration, the 
receiving terminals 108 each include a transmission request 
processing unit 110. The transmission request processing 
unit 110 produces a transmission request signal that 
requests an ES conforming to the processing ability of its 
own receiving terminal 108. The transmission request signal 
is transmitted from the reception control unit 109 to the 
server 100. The transmission request signal includes a 
signal that expresses the ability of the receiving terminal 
108. This signal represents, for example, a memory size, a 
resolution offered by a display unit, an arithmetic 
capability, a buffer size, an ES encoding format that the 
terminal can decode, the number of decodable ESs, or the bit 
rate of a decodable ES. In response to the transmission 
request signal, the conversion control unit 101 in the 
server 100 controls the ES processing unit 103 so that an ES 
that conforms to the performance of the receiving terminal 
108 will be transmitted. For the image signal conversion 
performed by the ES processing unit 103 to convert one ES 
into another that conforms to the performance of the 
receiving terminal 108, an image signal converting method 
that the present applicant has already proposed may be 
adopted, for example. 
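
The ability signal carried by the transmission request signal, and its use in choosing an ES, may be pictured with the following minimal sketch. It is an illustration under stated assumptions only: the class names, the field names, the units, and the selection rule in conforming_streams are hypothetical and are not taken from the specification.

```python
from dataclasses import dataclass, field
from typing import List, Set, Tuple


@dataclass
class TerminalAbility:
    """Hypothetical container for the ability signal; field names are assumptions."""
    memory_size: int                      # bytes
    display_resolution: Tuple[int, int]   # (width, height) in pixels
    arithmetic_capability: float          # arbitrary units
    buffer_size: int                      # bytes
    decodable_formats: Set[str] = field(default_factory=set)
    max_decodable_es: int = 1             # number of ESs decodable at once
    max_es_bit_rate: int = 0              # bits per second


@dataclass
class ElementaryStream:
    """Hypothetical description of one ES held by the server."""
    es_id: int
    encoding_format: str
    bit_rate: int                         # bits per second


def conforming_streams(streams: List[ElementaryStream],
                       ability: TerminalAbility) -> List[ElementaryStream]:
    """Return the ESs the terminal reports it can decode, capped at its ES count."""
    usable = [
        s for s in streams
        if s.encoding_format in ability.decodable_formats
        and s.bit_rate <= ability.max_es_bit_rate
    ]
    return usable[:ability.max_decodable_es]
```

In this sketch only the encoding format, the bit rate, and the number of decodable ESs are checked; the other reported abilities are carried in the structure but left unused.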
[0012] 

[Problems to be Solved by the Invention] 

In conventional television broadcasting, one scene is 
composed basically of an image (a still image alone or a 
motion picture alone) and sounds. Therefore, an image (a 
still image or a motion picture) alone is displayed on the 
display screen of a conventional receiver (television 
receiver), and sounds alone are radiated from a loudspeaker. 
[0013] 

In recent years, it has been considered that one scene is 
constructed using multimedia data that includes a still 
image signal, a motion picture signal, an acoustic signal, 
text data, graphic data, and various other signals. Methods 
of describing the construction of a scene based on the 
multimedia data include a method employing the hypertext 
markup language (HTML) that is adopted for home pages of web 
sites on the Internet. Also included are a method employing 
the MPEG-4 binary format for scenes (BIFS), which is a scene 
description format stipulated in ISO/IEC 14496-1, a method 
employing the virtual reality modeling language (VRML) 
stipulated in ISO/IEC 14772, and a method employing Java 
(trademark). Hereinafter, data describing the construction 
of a scene shall be referred to as a scene description. The 
scene description may include ES information that is needed 
to decode an ES to be used to construct a scene. Examples 
of the scene description will be described later. 

[0014] 

The conventional data distribution system shown in Fig. 
20 can construct and display a scene according to the scene 
description. 
[0015] 

However, as mentioned above, the bit rate at which an ES 
is transmitted may be controlled based on a change in the 
bandwidth offered by a transmission line or in the 
traffic-jammed state on the transmission line, or based on 
the performance of a receiving terminal. Even in this case, 
the conventional data distribution system decodes the ES 
according to the scene construction described in the same 
scene description and displays a scene using the resultant 
ES. In other words, the conventional data distribution 
system decodes the ES according to the same scene 
construction irrespective of whether the ES processing unit 
103 has modified the ES, and displays a scene using the 
resultant ES. However, that scene construction cannot be 
said to be optimal for the modified ES. For example, if the 
bit rate for the ES is lowered, poor image quality may 
become conspicuous. Conversely, even when the bit rate for 
the ES is raised, an appropriate image may not be displayed. 

[0016] 

Moreover, the conventional data distribution system 
shown in Fig. 20 can transmit a scene description together 
with ES information needed to decode an ES. As mentioned 
above, the conventional data distribution system constructs 
a scene according to the same scene description irrespective 
of whether the ES processing unit 103 modifies the ES. 
Therefore, for example, when the ES processing unit 103 
changes the parameters for encoding the ES, the ES 
information needed to decode the ES cannot be acquired from 
the scene description. In this case, in the conventional 
data distribution system, the ES decoding unit 112 in the 
receiving terminal 108 has to extract the information, which 
is needed to decode the ES, from the ES itself. 
Consequently, the receiving terminal 108 incurs a larger 
processing load, and the extraction takes much time. This 
poses a problem in that decoding of an ES and display of an 
image using the ES cannot be achieved within a desired 
period of time. 
[0017] 

Furthermore, according to the conventional data 
distribution system, when an ES used to construct a scene 
fails to reach the receiving terminal 108, the reason for 
the failure cannot be judged. Specifically, it cannot be 
judged whether the ES processing unit 103 in the server 100 
intentionally did not transmit the ES, the ES was lost on 
the transmission line, or the ES has not yet reached the 
receiving terminal 108 because of a delay in transmission. 
[0018] 

On the other hand, scene description data may be 
distributed over a transmission line whose bandwidth is not 
constant but varies with time or from channel to channel. 
Alternatively, scene description data may be distributed to 
a large, unspecified number of receiving terminals whose 
specifications are not known in advance and whose processing 
abilities differ from one another. In these cases, the 
server 100 in the conventional data distribution system has 
difficulty in determining an optimal scene construction in 
advance. In addition, a decoding unit in a receiving 
terminal may be realized with software, and the software of 
the decoding unit and software responsible for processing 
other than decoding may share the same CPU or memory. In 
this case, the processing ability of the decoding unit may 
vary dynamically. The server 100 therefore cannot determine 
an optimal scene description in advance. 

[0019] 

Moreover, in the case of the conventional data 
distribution system, the receiving terminal 108 may receive 
a scene description whose scene construction is too complex 
for the terminal to decode the ESs and display the scene in 
time. Alternatively, the receiving terminal 108 may receive 
a scene description that describes numerous ESs. In these 
cases, decoding of the ESs and decoding of the scene 
description are not completed in time. Consequently, 
decoding and display may fall out of synchronization, or a 
memory in which input data is stored temporarily may 
overflow. As a conceivable countermeasure, input data that 
cannot be processed by the receiving terminal 108 may be 
discarded. However, important data needed to construct a 
scene may then be lost. Besides, bandwidth is wasted on the 
transmission of data that is not used for image display. 
There is therefore a demand for a server 100 capable of 
distributing a scene description that conforms to the 
decoding ability or display ability of the receiving 
terminal 108. At present, such a server is unavailable. 
[0020] 

The present invention has been made in view of the 
foregoing situation. An object of the present invention is 
to provide a data transmission system, a data transmitting 
apparatus and method, and a scene description processing 
unit and method in which a scene description that conforms 
to the state of a transmission line or the processing 
ability of a receiving terminal can be transmitted to the 
receiving terminal. Moreover, a drawback such as the 
unexpected loss of part of a scene, which is not intended by 
the transmitting side, is prevented from arising from a loss 
that occurs on a transmission line or from the insufficient 
processing ability of a receiving terminal. Even when the 
bit rate at which an ES is transmitted is changed, the 
receiving terminal can decode the ES according to a scene 
construction that conforms to the bit rate, and display a 
scene using the resultant ES. Furthermore, a change in the 
information needed to decode the ES can be explicitly 
reported to the receiving terminal. Consequently, the 
receiving terminal need not extract the information, which 
is needed to decode the ES, from the ES itself. 
[0021] 

[Means for Solving the Problems] 

According to the present invention, there is provided a 
data transmission system consisting mainly of a transmitting 
apparatus and a receiving apparatus. The transmitting 
apparatus transmits scene description data that describes 
the structures of one or more signals in a scene. The 
receiving apparatus constructs the scene according to the 
scene description. The transmitting apparatus includes a 
scene description processing means that outputs the scene 
description data that conforms to the state of a 
transmission line and/or a request issued from the receiving 
apparatus. The data transmission system thus solves the 
aforesaid problems. 
[0022] 

Moreover, according to the present invention, there is 
provided a data transmitting method for transmitting scene 
description data, which describes the structures of one or 
more signals in a scene, and constructing the scene 
according to the scene description. According to the data 
transmitting method, the scene description data that 
conforms to the state of a transmission line and/or a 
request issued from a receiving side is transmitted. The 
data transmitting method thus solves the aforesaid problems. 
[0023] 

Next, according to the present invention, there is 
provided a data transmitting apparatus for transmitting 
scene description data that describes the structures of one 
or more signals in a scene. The data transmitting apparatus 
includes a scene description processing means for outputting 
the scene description data that conforms to the state of a 
transmission line and/or a request issued from a receiving 
side. Consequently, the data transmitting apparatus solves 
the aforesaid problems. 

[0024] 

Moreover, according to the present invention, there is 
provided a data transmitting method for transmitting scene 
description data that describes the structures of one or 
more signals in a scene. According to the data transmitting 
method, the scene description that conforms to the state of 
a transmission line and/or a request issued from a receiving 
side is transmitted. The data transmitting method thus 
solves the aforesaid problems. 
[0025] 

Next, according to the present invention, there is 
provided a scene description processing unit for processing 
scene description data that describes the structures of one 
or more signals to be used to construct a scene. Herein, 
when the scene description data is transmitted over a 
transmission line, the scene description data that conforms 
to the state of a transmission line and/or a request issued 
from a receiving side is output. The scene description 
processing unit thus solves the aforesaid problems. 

[0026] 

Moreover, according to the present invention, there is 
provided a scene description processing method for 
processing scene description data that describes the 
structures of one or more signals in a scene. According to 
the scene description processing method, when the scene 
description data is transmitted over a transmission line, 
the scene description data that conforms to the state of the 
transmission line and/or a request issued from a receiving 
side is output. The scene description processing method 
thus solves the aforesaid problems. 

[0027] 

According to the present invention, for example, a 
server for distributing data includes a scene description 
processing means that dynamically processes a scene 
description in conformity with the state of a transmission 
line or a request for transmission issued from a receiving 
terminal. The server then transmits a scene description, 
which conforms to the state of the transmission line or the 
processing ability of the receiving terminal, to the 
receiving terminal. Herein, the bit rate at which an ES is 
transmitted may be changed based on the state of the 
transmission line or of the receiving terminal. In this 
case, a scene description optimal for the ES whose bit rate 
has been changed is transmitted. Consequently, the 
receiving terminal can decode the ES according to a scene 
construction suitable for the bit rate of the ES. Moreover, 
when the bit rate at which an ES is transmitted is changed 
based on the state of the transmission line or of the 
receiving terminal, the information needed to decode the ES 
may change accordingly. In this case, the scene description 
that includes the information needed to decode the ES is 
modified accordingly. This relieves the receiving side of 
the necessity of extracting the information, which is needed 
for decoding, from the ES. Furthermore, since a scene 
description that explicitly describes the ESs to be used to 
construct a scene is transmitted to the receiving terminal, 
the receiving terminal can judge whether an ES is needed to 
construct the scene, irrespective of a delay in arrival of 
the ES at the receiving terminal or a loss of data. 
Moreover, the bit rate at which a scene description is 
transmitted is controlled based on the state of the 
transmission line, whereby a delay in data transmission or a 
loss of data is prevented from occurring on the transmission 
line. Moreover, when the ability of the receiving terminal 
varies dynamically, the server modifies a scene description 
and then transmits the resultant scene description. 
Consequently, an important part of a scene description is 
prevented from being discarded at the receiving terminal 
contrary to the intention of the server. Modifying a scene 
description here means any of the following. A scene 
description is selected from among a plurality of predefined 
scene descriptions and then output. Alternatively, a 
predefined scene description is received and converted into 
a scene description that conforms to the state of a 
transmission line or the ability of a receiving terminal. 
Alternatively, scene description data is produced or encoded 
in conformity with the state of a transmission line or the 
ability of a receiving terminal, and then transmitted. 
[0028] 

[Description of the Embodiments] 

A preferred embodiment of the present invention will be 
described with reference to the drawings below. 
[0029] 

Fig. 1 shows an example of the configuration of a data 
distribution system in accordance with the present 
embodiment. Compared with the conventional data 
distribution system shown in Fig. 20, the data distribution 
system in accordance with the present embodiment has a 
server 10 that includes a scene description processing unit 
2. Moreover, a receiving terminal 20 includes a scene 
description decoding unit 23 that decodes a scene 
description received from the scene description processing 
unit 2 (that is, interprets the scene description to 
construct a scene). Scene description processing to be 
performed by the scene description processing unit 2 will be 
detailed later. 

[0030] 

Referring to Fig. 1, an ES processing unit 3 included 
in the server 10 selects any of ESs stored in advance in a 
memory 4. Otherwise, the ES processing unit 3 receives a 
baseband image signal and acoustic signal, which are not 
shown, and encodes the signals to produce an ES. At this 
time, a plurality of ESs may be produced. A transmission 
control unit 5 included in the server 10 multiplexes the 
plurality of ESs if necessary, encodes the resultant ES 
according to the protocol used for transmission over a 
transmission medium 7, and transmits the ES to the receiving 
terminal 20. 
[0031] 

A reception control unit 21 included in the receiving 
terminal 20 decodes the ES, which has been transmitted over 
the transmission medium 7, according to the protocol, and 
hands the resultant ES to an ES decoding unit 24. If ESs 
are multiplexed, the reception control unit 21 separates the 
ESs, and hands the ESs to associated ES decoding units 24. 
The ES decoding unit 24 decodes an ES to restore an image 
signal and an acoustic signal. The image signal and 
acoustic signal produced by the ES decoding unit 24 are sent 
to a scene description decoding unit 23. The scene 
description decoding unit 23 constructs a scene using the 
image signal and acoustic signal according to a scene 
description transmitted from the scene description 
processing unit 2 that will be described later. A signal 
representing the scene is transferred to a display sounding 
unit 25 composed of a television monitor and a loudspeaker. 
Consequently, an image expressing the scene is displayed on 
the television monitor, and sounds expressing the scene are 
radiated from the loudspeaker. 

[0032] 

The server 10 corresponds to a transmission system 
installed at a broadcasting station that provides a 
broadcasting service, or to an Internet server or home 
server that gives access to the Internet. The receiving 
terminal 20 corresponds to a receiving apparatus that 
receives a broadcast signal, or to a personal computer. The 
transmission medium 7 corresponds to a leased transmission 
line provided in a broadcasting system, or to a high-speed 
communication network included in the Internet. 
[0033] 

Moreover, the data distribution system in accordance 
with the present embodiment performs the actions described 
below to overcome the drawback that a change in the 
bandwidth of a transmission line (transmission medium 7) 
over which an ES is transmitted, or a change in the 
traffic-jammed state on the transmission line, leads to a 
delay in data transmission or a loss of transmitted data. 

[0034] 

The server 10 (for example, the transmission control 
unit 5) assigns an (encoded) serial number to each packet in 
which data is transmitted over a transmission line. The 
reception control unit 21 in the receiving terminal 20 
monitors the packets received over the transmission line to 
see whether any (encoded) serial number that should have 
been assigned to a packet is missing, and thus detects a 
loss of data (a data loss rate). Alternatively, the server 
10 (for example, the transmission control unit 5) appends 
(encoded) time instant information to data to be transmitted 
over the transmission line. The reception control unit 21 
in the receiving terminal 20 monitors the (encoded) time 
instant information appended to the data received over the 
transmission line, and thus detects a delay in transmission 
from the time instant information. The reception control 
unit 21 in the receiving terminal 20 thus detects the data 
loss rate on the transmission line or the delay in 
transmission thereon. The reception control unit 21 then 
transmits (reports) the detected information to the 
data-transmitted state detector 6 included in the server 10. 
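
The loss and delay detection performed on the receiving side may be sketched as follows. This is a minimal illustration rather than the embodiment itself: the Packet class and its field names are hypothetical, serial numbers are assumed to be consecutive integers, and the clocks of the server and the receiving terminal are assumed to be synchronized.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Packet:
    """Hypothetical received packet; field names are illustrative only."""
    serial: int         # serial number assigned by the sender
    sent_at: float      # time instant appended by the sender (seconds)
    received_at: float  # arrival time at the receiving terminal (seconds)


def data_loss_rate(packets: List[Packet]) -> float:
    """Estimate the loss rate from gaps in the received serial numbers."""
    if not packets:
        return 0.0
    serials = {p.serial for p in packets}
    # Assumes the sender numbers packets consecutively.
    expected = max(serials) - min(serials) + 1
    return (expected - len(serials)) / expected


def mean_delay(packets: List[Packet]) -> float:
    """Average transmission delay, assuming synchronized clocks."""
    if not packets:
        return 0.0
    return sum(p.received_at - p.sent_at for p in packets) / len(packets)
```

Packets lost before the first or after the last received serial number are not counted by this estimate, which is one reason it is only a sketch.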
[0035] 

On receiving the data loss rate on the transmission line 
or the delay in transmission thereon from the reception 
control unit 21 in the receiving terminal 20, the 
data-transmitted state detector 6 in the server 10 detects a 
bandwidth offered by the transmission line or the 
traffic-jammed state of the transmission line. In other 
words, if the data loss rate is high or the delay in 
transmission has increased, the data-transmitted state 
detector 6 judges that the transmission line is jammed. If 
a transmission line of a bandwidth reservation type is 
adopted, the data-transmitted state detector 6 can detect an 
available bandwidth usable by the server 10. If a 
transmission medium that depends on weather conditions, such 
as radio waves, is adopted, a user may designate a bandwidth 
in advance in accordance with the weather conditions or the 
like. The information on the data-transmitted state 
detected by the data-transmitted state detector 6 is sent to 
the conversion control unit 1. 
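
The judgment made by the data-transmitted state detector 6 may be pictured as a simple threshold test. The sketch below is an assumption-laden illustration: the specification gives no numerical criteria, so the function name and both default thresholds are hypothetical.

```python
def is_jammed(loss_rate: float,
              delay: float,
              previous_delay: float,
              loss_threshold: float = 0.05,
              delay_increase_threshold: float = 0.05) -> bool:
    """Judge the line jammed if the loss rate is high or the delay has increased.

    Both thresholds are hypothetical values chosen only for illustration.
    """
    delay_increased = (delay - previous_delay) > delay_increase_threshold
    return loss_rate > loss_threshold or delay_increased
```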
[0036] 

Based on the detected information on the bandwidth of 
the transmission line or the traffic-jammed state thereof, 
the conversion control unit 1 controls the ES processing 
unit 3 so that the ES processing unit 3 will switch between 
ESs that have different bit rates. Alternatively, when the 
ES processing unit 3 encodes an ES according to ISO/IEC 
13818 (so-called MPEG-2), the encoding bit rate is 
controlled. In other words, if it is detected that the 
transmission line is jammed, the ES processing unit 3 
outputs an ES having a low bit rate. Consequently, a delay 
in data transmission can be avoided. 
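
Switching between ESs of different bit rates may be sketched as choosing the highest stored bit rate that fits the detected bandwidth. The function name and the fallback rule (use the lowest rate when nothing fits) are assumptions made for this illustration and are not stated in the specification.

```python
from typing import Sequence


def select_es_bit_rate(available_rates: Sequence[int], bandwidth: int) -> int:
    """Pick the highest stored bit rate that does not exceed the bandwidth.

    Falls back to the lowest stored rate when no rate fits (an assumption).
    """
    if not available_rates:
        raise ValueError("no ES is available")
    fitting = [rate for rate in available_rates if rate <= bandwidth]
    return max(fitting) if fitting else min(available_rates)
```

For example, with stored rates of 500 kbit/s, 1.5 Mbit/s and 4 Mbit/s and a detected bandwidth of 2 Mbit/s, the 1.5 Mbit/s ES would be chosen.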

[0037] 

Moreover, a large, unspecified number of receiving 
terminals 20 may be connected to the server 10, and the 
specifications of the receiving terminals 20 may not be 
uniform. The server 10 may therefore have to transmit an ES 
to receiving terminals 20 whose processing abilities differ 
from one another. In this case, each of the receiving 
terminals 20 includes a transmission request processing unit 
22. The transmission request processing unit 22 produces a 
transmission request signal that requests an ES conforming 
to the processing ability of its own receiving terminal 20. 
The transmission request signal is transmitted from the 
reception control unit 21 to the server 10. The 
transmission request signal includes a signal that expresses 
the ability of the receiving terminal 20. This signal, 
which is transferred from the transmission request 
processing unit 22 to the server 10, represents, for 
example, a memory size, a resolution offered by the display 
unit, an arithmetic capability, a buffer size, an ES 
encoding format that the terminal can decode, the number of 
decodable ESs, or the bit rate of a decodable ES. In 
response to the transmission request signal, the conversion 
control unit 1 in the server 10 controls the ES processing 
unit 3 so that the ES processing unit 3 will transmit an ES 
that conforms to the performance of the receiving terminal 
20. For the image signal conversion that converts an ES 
into another ES conforming to the performance of the 
receiving terminal 20, an image signal converting method 
that the present applicant has already proposed may be 
adopted, for example. 
[0038] 

The aforesaid components and actions are identical to 
those of the example shown in Fig. 20. In the data 
distribution system of the present embodiment, the 
conversion control unit 1 in the server 10 controls not only 
the ES processing unit 3 but also the scene description 
processing unit 2 according to the state of the transmission 
line detected by the data-transmitted state detector 6. 
Moreover, if the receiving terminal 20 is a receiving 
terminal that requests a scene description which conforms to 
its decoding and display abilities, the conversion control 
unit 1 in the server 10 controls the ES processing unit 3 
and the scene description processing unit 2 according to the 
signal that expresses the ability of the receiving terminal 
and that is sent from the transmission request processing 
unit 22 in the receiving terminal 20. More specifically, 
the scene description processing unit 2 employed in the 
present embodiment performs five kinds of scene description 
processing, namely first to fifth scene description 
processing, which will be described below, under the control 
of the conversion control unit 1. 

[0039] 

The first to fifth scene description processing 
employed in the present embodiment will be described below. 
[0040] 

To begin with, the first scene description processing 
will be described. The server 10 employed in the present 
embodiment can transfer a scene description suitable for an 
ES produced by the ES processing unit 3. In other words, 
the scene description processing unit 2 employed in the 
present embodiment can produce a scene description, which is 
suitable for an ES produced by the ES processing unit 3, 
under the control of the conversion control unit 1. The 
first scene description processing will be described 
concretely in conjunction with Fig. 2 to Fig. 6. 

[0041] 

Fig. 2 shows an example of displaying a scene 
constructed using a motion picture ES and a still image ES. 
Referring to Fig. 2, there is shown a scene display field 
Esi. A motion picture ES display field Emv is contained in 
the scene display field Esi, and a still image ES display 
field Esv is also contained in the scene display field Esi. 

[0042] 

Fig. 3 shows a scene description that describes the 
construction of a scene to be displayed in the scene display 
field Esi. The scene description is written in compliance 
with the MPEG-4 BIFS. The adoption of the VRML results in a 
scene description written as text data, while the adoption 
of the MPEG-4 BIFS results in a scene description written as 
binary-encoded data. If the scene description for the scene 
shown in Fig. 2 is written in compliance with the MPEG-4 
BIFS, the scene description is in reality binary-coded. 
Fig. 3, however, shows the scene description in the form of 
text for ease of understanding. A method of writing a scene 
description in compliance with the MPEG-4 BIFS is stipulated 
in ISO/IEC 14496-1, and the description of the method will 
therefore be omitted. 
[0043] 

A scene description written in compliance with the 
MPEG-4 BIFS (or VRML) is expressed using a basic description 
unit that is referred to as a node. Referring to Fig. 3, a 
node is written with bold characters. The node is a unit 
that describes an object to be displayed or a connection 
between objects, and contains data that is referred to as a 
field and that expresses the property or attribute of the 
node. For example, a node "Transform" in Fig. 3 is a node 
that specifies three-dimensional coordinate transformation. 
A field "translation" subordinate to the node "Transform" 
specifies the amount of parallel movement of the origin of 
the coordinate system. Moreover, some fields point to other 
nodes. For example, the node "Transform" in Fig. 3 contains 
a field "children" that specifies a group of child nodes. 
The child nodes specify the objects to be subjected to the 
coordinate transformation. For example, a node "Shape" and 
others are grouped into the field "children." In order to 
arrange objects to be displayed in a scene image, a node 
that specifies an object and nodes that specify the 
attributes of the object are grouped together, and grouped 
under a node that specifies the position at which the object 
should be located. For example, an object specified in a 
node "Shape" in Fig. 3 is subjected to parallel movement as 
specified in the parent node "Transform" and then arranged 
in a scene. Moreover, video data and audio data are 
arranged spatially and temporally according to a scene 
description, and then made visible and audible. For example, 
a node "MovieTexture" in Fig. 3 specifies that a cube is 
displayed with a motion picture, which is identified with an 
identification (ID) number of 3, pasted to the surface 
thereof. 
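
The node and field structure described above may be pictured with the following minimal sketch of a scene tree. It is an illustration only: the Node class and the find_nodes helper are hypothetical, the node and field names follow VRML conventions, and the field values are illustrative and are not copied from Fig. 3.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, List


@dataclass
class Node:
    """A scene description node: a type name plus named fields."""
    type_name: str
    fields: Dict[str, Any] = field(default_factory=dict)


def find_nodes(node: Node, type_name: str) -> List[Node]:
    """Depth-first search for nodes of a given type.

    Fields whose values are nodes, or lists of nodes, are followed.
    """
    found = [node] if node.type_name == type_name else []
    for value in node.fields.values():
        candidates = value if isinstance(value, list) else [value]
        for child in candidates:
            if isinstance(child, Node):
                found.extend(find_nodes(child, type_name))
    return found


# A cube whose surface carries the motion picture with ID 3.
# How the ID is carried in the "url" field is an assumption of this sketch.
scene = Node("Transform", {
    "translation": (1.0, 0.0, 0.0),
    "children": [
        Node("Shape", {
            "geometry": Node("Box"),
            "appearance": Node("Appearance", {
                "texture": Node("MovieTexture", {"url": 3}),
            }),
        }),
    ],
})
```

Calling find_nodes(scene, "MovieTexture") on this tree returns the single texture node, which is how a receiving terminal could list the ESs that a scene description refers to.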

[0044] 

The scene description shown in Fig. 3 describes that a 
scene contains two cubes and that a motion picture and a 
still image are pasted to the surfaces of the cubes in order 
to express the textures of the surfaces. Coordinate 
transformation is specified for each of the objects in the 
node "Transform." The object is moved in parallel according 
to a value specified in a field "translation" indicated with 
#500 or #502 in Fig. 3 (the origin of a local coordinate 
system). Moreover, enlargement or reduction of the object 
specified in the node "Transform" is specified with a value 
indicated with #501 or #503 (scaling of a local coordinate 
system). 
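
The effect of the translation and scaling values on an object may be sketched as follows. This is a simplified illustration: rotation and the centre of scaling are omitted, the function name is hypothetical, and the order (scaling first, then translation) follows the definition of the VRML Transform node.

```python
from typing import Tuple

Vec3 = Tuple[float, float, float]


def transform_point(point: Vec3,
                    translation: Vec3 = (0.0, 0.0, 0.0),
                    scale: Vec3 = (1.0, 1.0, 1.0)) -> Vec3:
    """Map a point in local coordinates into the parent coordinate system.

    Each coordinate is scaled first and then translated.
    """
    x, y, z = (s * p + t for p, s, t in zip(point, scale, translation))
    return (x, y, z)
```

Halving the scale of a unit cube and translating it by 2 along the x axis, for instance, moves its corner at (1, 1, 1) to (2.5, 0.5, 0.5).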

[0045] 

For example, assume that the bit rate at which data is 
transmitted must be lowered due to the state of a 
transmission line or a request issued from a receiving 
terminal. In this case, for example, a motion picture ES is 
modified in order to lower the bit rate for the motion 
picture ES. This is because a motion picture ES accounts 
for a large amount of the data to be transmitted. Assume 
also that, at this time, a high-resolution still image ES 
has already been transmitted and stored in the receiving 
terminal. 

[0046] 

In this case, the conventional data distribution system 
decodes an ES according to the same scene construction 
irrespective of whether the bit rate for the ES has been 
controlled, and then displays an image using the resultant 
ES. Therefore, when a motion picture is displayed based on 
a motion picture ES whose bit rate has been lowered, poor 
image quality or the like becomes conspicuous. This will be 
described concretely below, taking the example shown in Fig. 
2. In the conventional data distribution system, even when 
the bit rate for the motion picture ES displayed in the 
motion picture ES display field Emv in Fig. 2 is lowered or 
otherwise controlled, the ES is decoded according to the 
same scene construction as that used for an ES whose bit 
rate is not controlled. A motion picture is then displayed 
based on the resultant ES. In other words, the motion 
picture ES is decoded so that the motion picture will be 
displayed occupying the entire motion picture ES display 
field Emv, which is too wide for the actual bit rate. 
Consequently, the motion picture displayed based on the 
motion picture ES appears rough (for example, appears to 
exhibit a low spatial resolution), and poor image quality is 
conspicuous. 
[0047] 

In contrast, when the bit rate for a motion picture ES is
lowered, the motion picture ES display field Emv may be
narrowed as shown in Fig. 4. In this case, the poor image
quality of the motion picture displayed in the motion picture
ES display field Emv (in this case, a low spatial
resolution) becomes less conspicuous. Moreover, according
to the present embodiment, a still image ES has already been
transmitted and stored in the receiving terminal. If the
still image represented by the still image ES is, for
example, a high-resolution image, the still image ES display
field Esv in Fig. 2 may be too narrow for the resolution.
In this case, the still image ES display field Esv may be
made wider as shown in Fig. 4. Thus, the high resolution of
the still image can be fully utilized. In order to narrow the
motion picture ES display field Emv or widen the still image
ES display field Esv in this way, the scene description must
be modified to describe such a scene construction.
[0048] 

The scene description processing unit 2 employed in the
present embodiment dynamically modifies a scene description
according to whether the ES processing unit 3 has controlled
the bit rate for an ES. In other words, when the conversion
control unit 1 in the server 10 employed in the present
embodiment instructs the ES processing unit 3 to control the
bit rate for an ES, the conversion control unit 1 also
instructs the scene description processing unit 2 to produce
a scene description suitable for the ES to be transferred
from the ES processing unit 3. Consequently, according to
the present embodiment, even when the bit rate for a motion
picture is lowered, the deteriorated image quality is
inconspicuous. According to the present embodiment, the
motion picture ES display field Emv is narrowed as shown in
Fig. 4, while the still image ES display field Esv is
widened in order to make the most of the high resolution of
the still image whose signal has already been transmitted.

[0049] 

Referring to Fig. 5, a description will be made of
concrete actions to be performed by the conversion control
unit 1 in order to implement the above feature.
[0050] 

If a bit rate at which data is transmitted must be 
lowered due to the state of a transmission line or a request 
issued from a receiving terminal, the conversion control 
unit 1 controls the ES processing unit 3 so that the ES 
processing unit will produce a motion picture ES 203, which 
will be transmitted at a lower bit rate than a motion 
picture ES 202 is, at a time instant T in Fig. 5. 

[0051] 

Moreover, the conversion control unit 1 controls the 
scene description processing unit 2 so that the scene 
description processing unit 2 will convert a scene 
description 200 corresponding to the scene display field Esi 
shown in Fig. 2 into a scene description 201 corresponding 
to the scene display field Esi shown in Fig. 4. 
Specifically, under the control by the conversion control 
unit 1, the scene description processing unit 2 converts the 
scene description, which is shown in Fig. 3 and appears in 
the scene display field Esi shown in Fig. 2, into the scene 
description which is shown in Fig. 6 and which appears in 
the scene display field Esi shown in Fig. 4. The scene
description shown in Fig. 6 is, like the one shown in Fig. 3,
a text version of an actual scene description written in
compliance with the MPEG-4 BIFS.
[0052] 

In the scene description shown in Fig. 6, the values specified
in the fields "translation" indicated with #600 and #602 in the
drawing (the origins of the local coordinate planes) differ from
the values specified in the scene description shown in Fig. 3,
whereby the two cubes are moved. The cube having the
motion picture (displayed in the field Emv in Fig. 4) pasted
to its surface is converted into a smaller cube
according to the value specified in the field indicated with
#601. The other cube, having the still image (displayed in the
field Esv in Fig. 4) pasted to its surface, is
converted into a larger cube according to the value specified
in the field indicated with #603.
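
The change of the "translation" and scaling values described above can be sketched as follows. This is only an illustrative sketch: the class Transform, the function move_and_scale, and all numeric values are assumptions introduced here, and they do not reproduce the values shown in Fig. 3 or Fig. 6.

```python
from dataclasses import dataclass, replace
from typing import Tuple

Vec3 = Tuple[float, float, float]


@dataclass(frozen=True)
class Transform:
    """Hypothetical stand-in for a "Transform" node of a scene description."""
    label: str
    translation: Vec3  # origin of the local coordinate plane
    scale: Vec3        # scaling of the local coordinate plane


def move_and_scale(node: Transform, translation: Vec3, factor: float) -> Transform:
    """Return a copy of the node with a new origin and a uniformly scaled size."""
    sx, sy, sz = node.scale
    return replace(
        node,
        translation=translation,
        scale=(sx * factor, sy * factor, sz * factor),
    )


# Illustrative values only; they are not the values shown in the drawings.
movie_cube = Transform("motion picture", (-2.0, 0.0, 0.0), (1.0, 1.0, 1.0))
still_cube = Transform("still image", (2.0, 0.0, 0.0), (1.0, 1.0, 1.0))

# Bit rate lowered: shrink the motion picture cube, enlarge the still image cube.
small_movie_cube = move_and_scale(movie_cube, (-3.0, 0.0, 0.0), 0.5)
large_still_cube = move_and_scale(still_cube, (1.0, 0.0, 0.0), 1.5)

print(small_movie_cube)
print(large_still_cube)
```

The original nodes are left untouched, which mirrors the idea that the unmodified scene description may remain available for later use.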

[0053] 

For example, the conversion of the scene description
shown in Fig. 3 into the scene description shown in Fig. 6,
which is performed during the aforesaid first scene
description processing, is realized with any of the following
actions performed by the scene description processing unit 2.
Namely, a scene description (the scene
description shown in Fig. 6) suitable for the ES produced by
the ES processing unit 3 is selected from among a plurality
of scene descriptions stored in advance in the memory 4, and
then transmitted. Alternatively, a scene description (the scene
description shown in Fig. 3) read from the memory 4 is
converted into a scene description (the scene description
shown in Fig. 6) suitable for the ES produced by the ES
processing unit 3, and then transmitted. Alternatively, a scene
description (the scene description shown in Fig. 6) suitable
for the ES produced by the ES processing unit 3 is produced
or encoded and then transmitted. When the scene description
form permits description of only the portion of a scene
description that must be modified, that portion alone may be
modified and transmitted. In the aforesaid example, when the
bit rate for a motion picture ES is lowered, the motion picture
ES display field Emv is narrowed. Conversely, when the bit rate
is raised, the motion picture ES display field Emv may be
widened. The feature of the present invention for modifying
the scene description can be applied to this case as well.
Furthermore, in the aforesaid example, a high-resolution
still image ES is transmitted in advance. Alternatively,
when a still image whose signal has already been
transmitted and stored exhibits a low resolution, a
high-resolution still image ES may be newly transmitted together
with a scene description suitable for that still image ES.
In the present embodiment, a motion
picture and a still image are taken as examples. The
present invention also applies to a case where a scene
description is modified because the bit rate for other
multimedia data has been controlled.
[0054] 

According to the first scene description processing 
described in conjunction with Fig. 2 to Fig. 6, a scene 
description that is data describing a scene construction is 
modified. Consequently, a scene description that conforms 
to the state of a transmission line or a request issued from 
a decoding terminal can be transmitted. Moreover, when the 
ES processing unit 3 modifies an ES, a scene description 
suitable for the resultant ES can be transmitted. 

[0055] 

Next, second scene description processing will be 
described below. 
[0056] 

For example, when the ES processing unit 3 changes a 
bit rate for an ES according to the state of a transmission 
line or the state of the receiving terminal 20, information 
needed to decode the ES may be modified. In this case, the 
server 10 employed in the present embodiment converts a 
scene description which includes the information needed to 
decode the ES and transmits the resultant scene description. 
The conversion and transmission are performed as second 
scene description processing. This relieves the receiving
terminal of the need to extract the information
needed to decode an ES from the ES itself, which the
receiving terminal in the conventional data distribution
system has to do. Specifically, when
information needed to decode an ES is modified because the
ES processing unit 3 has modified the ES, the scene
description processing unit 2 employed in the present
embodiment produces a scene description that includes the
information needed to decode the ES, under the control of
the conversion control unit 1. The information
needed to decode an ES includes, for example, the ES encoding
format, the buffer size required for decoding, and the bit rate.
The second scene description processing will be described
concretely below with reference to the drawings referred to
previously as well as Fig. 7 and Fig. 8.
[0057] 

Fig. 7 shows an example of the information needed to decode
the ESs used to display a scene like the one described
in conjunction with Fig. 2 and Fig. 3. The information is described
in a descriptor "ObjectDescriptor" stipulated in the MPEG-4.
In the scene description shown in Fig. 3, the motion picture
to be mapped onto the surface of the object in order to
express the texture of the surface is specified with a value
of 3 (=url3). The value corresponds to the value of the
identifier (ODid=3) subordinate to the descriptor
"ObjectDescriptor" shown in Fig. 7. A descriptor
"ES_Descriptor" subordinate to the "ObjectDescriptor"
concerning the object identified with the identifier
"ODid=3" describes information concerning an ES. Moreover,
"ES_ID" in Fig. 7 is an identifier unique to an ES. The
identifier "ES_ID" is related to a header identifier
or a port number that is appended to the ES as defined in the
protocol adopted for transmission of the ES, and is thus
associated with the actual ES.
[0058] 

Moreover, the descriptor "ES_Descriptor" contains a
descriptor "DecoderConfigDescriptor" that describes
information needed to decode an ES. The information
described in the descriptor "DecoderConfigDescriptor"
includes, for example, the buffer size needed to decode the ES,
a maximum bit rate, and an average bit rate.

[0059] 

Fig. 8 shows an example of the information that is needed
to decode an ES and that is appended to the scene description
modified by the scene description processing
unit 2 so as to correspond to the scene shown in Fig. 4. The
information is described using a descriptor "ObjectDescriptor"
stipulated in the MPEG-4. As to the motion picture ES
modified by ES conversion (referenced from the scene
description with the identifier ODid of 3), the
decoding buffer size specified in "bufferSizeDB," the maximum
bit rate specified in "maxBitRate," and the average bit rate
specified in "avgBitRate," which are shown in Fig. 7, are
changed to those described in the descriptor
"ObjectDescriptor" shown in Fig. 8. In other words, in the
example shown in Fig. 7, 4000 is specified in
"bufferSizeDB," 1000000 is specified in "maxBitRate," and
1000000 is specified in "avgBitRate." In Fig. 8,
2000 is specified in "bufferSizeDB," 5000000 is specified in
"maxBitRate," and 5000000 is specified in "avgBitRate."
[0060] 

The modification of the information needed to decode an ES
and appended to a scene description, which is performed
during the second scene description processing, is realized
with any of the following actions performed by the scene
description processing unit 2. Namely, the information
associated with the ES produced by the ES processing unit 3
(the information shown in Fig. 8) is selected from among a
plurality of information items needed to decode ESs, which
are stored in the memory 4 in advance, and then transmitted.
Alternatively, information needed to decode an ES
(the information shown in Fig. 7) is read from the memory 4,
converted into the information needed to decode the ES produced
by the ES processing unit 3, and then transmitted.
Alternatively, the information needed to decode the ES produced
by the ES processing unit 3 is encoded and then transmitted.
[0061] 

When the bit rate for an ES is changed in conformity with
the state of the transmission line or the state of the
receiving terminal 20, the information needed to decode the ES
is modified. In this case, according to the aforesaid
second scene description processing, the information needed to
decode the ES and appended to the scene description is modified
as shown in Fig. 8, and transmitted to the receiving
terminal 20. This relieves the receiving terminal 20 of the
need to extract the information needed to decode the ES
from the ES itself.
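
The second scene description processing can be illustrated with the following sketch, in which the decoding information attached to the object with ODid 3 is rewritten when the ES is modified. The class and function names are assumptions made for illustration, the ES_ID value of 100 is a placeholder that does not come from the drawings, and the three decoding values are the ones quoted in the text for Fig. 7 and Fig. 8.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class DecoderConfig:
    """Hypothetical model of the fields named for "DecoderConfigDescriptor"."""
    buffer_size_db: int
    max_bit_rate: int
    avg_bit_rate: int


@dataclass(frozen=True)
class ObjectDescriptor:
    """Hypothetical model of an "ObjectDescriptor" entry."""
    od_id: int
    es_id: int
    decoder_config: DecoderConfig


def with_decoder_config(od: ObjectDescriptor, **changes: int) -> ObjectDescriptor:
    """Return a copy of the descriptor whose decoding information is updated."""
    return replace(od, decoder_config=replace(od.decoder_config, **changes))


# es_id=100 is a placeholder; the other numbers are those quoted for Fig. 7.
before = ObjectDescriptor(
    od_id=3,
    es_id=100,
    decoder_config=DecoderConfig(4000, 1000000, 1000000),
)

# Values quoted in the text for Fig. 8.
after = with_decoder_config(
    before,
    buffer_size_db=2000,
    max_bit_rate=5000000,
    avg_bit_rate=5000000,
)

print(after)
```

Because the updated descriptor travels with the scene description, a receiver modeled this way could read the buffer size and bit rates directly instead of analyzing the ES.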

[0062] 

Next, third scene description processing will be 
described below. 
[0063] 

During the third scene description processing, the
server 10 employed in the present embodiment explicitly
modifies a scene description to increase or decrease the
number of ESs used to construct a scene, and transfers the
resultant scene description. Consequently, only the ESs
that fit within the bandwidth of the transmission line
are transmitted. Meanwhile, the receiving
terminal 20 can judge whether an ES is needed to display a
scene, irrespective of a delay in arrival of the ES or a loss
of data. Specifically, the scene description processing unit
2 included in the server 10 employed in the present
embodiment explicitly modifies a scene description to
increase or decrease the number of ESs under the control of
the conversion control unit 1, and transfers the resultant
scene description. Irrespective of a delay in arrival of an
ES or a loss of data, the scene description decoding unit 23
included in the receiving terminal 20 judges whether the ES
is needed to display a scene. The third scene description
processing will be described concretely in conjunction with
the drawings referred to previously as well as Fig. 9 and
Fig. 10.

[0064] 

Fig. 9 shows a scene description from which, for
example, the description of the motion picture ES
included in the scene description described in conjunction
with Fig. 2 and Fig. 3 has been removed, and which is written
in compliance with the MPEG-4 BIFS (a text version). Fig. 10
shows an example of a scene displayed based on the scene
description shown in Fig. 9. The scene display field Esi
contains only an image ES display field (for example, a still
image ES display field) Eim. It can be judged from the scene
description shown in Fig. 9 that the only ES described in the
scene description is the ES identified with the value of 4
specified in the identifier "ODid." Even if the motion
picture ES identified with the value of 3 specified in the
identifier "ODid" does not arrive, the receiving terminal 20
can judge that this is not attributable to a delay in arrival
of the ES or a loss of data. Moreover, since the descriptor
"ObjectDescriptor" concerning the ES identified with the
value of 3 in "ODid," like the one shown in Fig. 7 or Fig. 8,
is deleted, it can be judged that the motion picture ES
identified with the value of 3 specified in "ODid" is no
longer needed.
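
The judgment made by the receiving terminal 20 can be sketched as a comparison between the ODid values referenced by the current scene description and the ODid values of the ESs that have actually arrived. The function name missing_needed_streams is an assumption introduced for illustration; only the ODid values 3 and 4 come from the text.

```python
from typing import Iterable, Set


def missing_needed_streams(described_od_ids: Iterable[int],
                           arrived_od_ids: Iterable[int]) -> Set[int]:
    """Return the ODids that the scene description references but that have not arrived."""
    return set(described_od_ids) - set(arrived_od_ids)


# Scene description before conversion: motion picture (ODid 3) and still image (ODid 4).
full_scene = {3, 4}
# Scene description after conversion: only the still image (ODid 4) is described.
reduced_scene = {4}
# Only the ES with ODid 4 has arrived at the terminal.
arrived = {4}

# With the full scene description, the absence of ODid 3 suggests a delay or a loss.
print(missing_needed_streams(full_scene, arrived))
# With the reduced scene description, nothing that is needed is missing.
print(missing_needed_streams(reduced_scene, arrived))
```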
[0065] 

During the third scene description processing, the
receiving terminal 20 may issue a transmission request
asking that the processing load it
incurs to decode scene data and construct a scene be
reduced temporarily. In this case, the server 10 converts a
scene description, for example, the one shown in Fig. 3, into
the scene description shown in Fig. 9. Consequently, the
receiving terminal 20 is explicitly informed that
a motion picture need not be mapped onto an object
in the scene in order to express the texture of the object.
This leads to a reduction in the processing load that the
receiving terminal 20 incurs for decoding scene data.

[0066] 

The conversion of the scene description shown in Fig. 3
into the scene description shown in Fig. 9, which is
performed during the third scene description processing, is
realized with any of the following actions performed by the
scene description processing unit 2.
Specifically, a scene description (the scene description shown
in Fig. 9) associated with the number of ESs produced by the
ES processing unit 3 is selected from among a plurality of
scene descriptions stored in advance in the memory 4, and
then transmitted. Alternatively, a scene description is read
from the memory 4 and converted into a scene description
(the scene description shown in Fig. 9) devoid of the part data
(contained in the scene description) that describes an ES
which will not be transferred. The resultant scene
description is then transmitted. Alternatively, when a scene
description is encoded and output, the part of the scene
description that describes an ES which will not be
transferred is not encoded.
[0067] 

As described so far, according to the related art, a
scene description cannot be modified. When the processing
load a receiving terminal must incur exceeds the processing
ability of the receiving terminal, part of the scene data may be
lost unexpectedly, or display of a scene may be delayed.
According to the third scene description processing employed
in the present embodiment, a scene description is modified
as mentioned above. Consequently, the receiving terminal 20
can restore a scene as intended by the server 10 at the
intended timing. Moreover, according to the third scene
description processing, the scene description processing
unit 2 can delete part data of a scene description in
ascending order of importance until the processing load
conforms to the processing ability of the receiving terminal
20 or until the data representing the scene
description fits within the bandwidth of the transmission
line. Moreover, according to the third scene description
processing, when the processing ability of the receiving
terminal 20 has room for a heavier load, a more detailed
scene description can be transmitted. Consequently, scene
data suitable for the processing ability of the receiving
terminal 20 can be decoded, and a scene can be displayed
based on the scene data.
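
The deletion of part data in ascending order of importance can be sketched as follows. The class Part, its importance scale, its cost figure (standing for either a processing load or an amount of data), and the function prune_to_budget are all assumptions introduced for illustration; the text does not state how importance or load is measured.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Part:
    """Hypothetical piece of part data in a scene description."""
    name: str
    importance: int  # assumed scale: a larger value means more important
    cost: int        # assumed processing load or amount of data


def prune_to_budget(parts: List[Part], budget: int) -> List[Part]:
    """Drop the least important parts first until the total cost fits the budget."""
    order = sorted(range(len(parts)), key=lambda i: parts[i].importance)
    total = sum(p.cost for p in parts)
    dropped = set()
    for i in order:
        if total <= budget:
            break
        total -= parts[i].cost
        dropped.add(i)
    # Keep the surviving parts in their original order.
    return [p for i, p in enumerate(parts) if i not in dropped]


parts = [
    Part("still image", importance=3, cost=40),
    Part("motion picture", importance=1, cost=100),
    Part("caption", importance=2, cost=10),
]

kept = prune_to_budget(parts, budget=60)
print([p.name for p in kept])
```

Raising the budget keeps more parts, which corresponds to transmitting a more detailed scene description when the receiving terminal has room for a heavier load.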
[0068] 

Next, fourth scene description processing will be 
described below. 
[0069] 

During the fourth scene description processing, the
server 10 employed in the present embodiment modifies the
complexity of a scene description according to the state of
a transmission line or a request issued from the receiving
terminal 20. Thus, the amount of data of the scene
description is adjusted, and the processing load the
receiving terminal 20 incurs is adjusted. Specifically, the
scene description processing unit 2 employed in the present
embodiment adjusts the amount of data of a scene description
in conformity with the state of the transmission line or a
request issued from the receiving terminal 20 under the
control of the conversion control unit 1, and then transmits
the resultant scene description. The fourth scene
description processing will be described concretely in
conjunction with Fig. 11 to Fig. 14 below.
[0070] 

Fig. 11 shows a scene description that describes the
construction of a scene which contains an object described
as a polygon, and that is written in compliance with the
MPEG-4 BIFS (a text version for a better understanding).
For brevity's sake, the coordinates representing the position of
the polygon are omitted from the example of Fig. 11. In the
scene description shown in Fig. 11, "IndexedFaceSet"
describes a geometric object constructed by linking the vertices,
whose coordinates are specified in "point" subordinate to
"Coordinate," in the order specified in "CoordIndex."
Moreover, Fig. 12 shows an example of display of a scene
achieved by decoding the scene description shown in Fig. 11
(an example of display of an object described as a polygon).

[0071] 

During the fourth scene description processing, the
amount of data to be transmitted from the server 10 may have
to be reduced due to the state of the transmission line, or a
transmission request asking that the processing load be
reduced may be transmitted from the receiving terminal 20.
In this case, the scene description processing unit 2
included in the server 10 converts a scene description into
a simpler scene description. For example, the scene
description in which "IndexedFaceSet" describes the polygon
shown in Fig. 12 is converted into the scene description which
is shown in Fig. 13 and in which "Sphere" describes a sphere
like the one shown in Fig. 14. Consequently, the amount of
data of the scene description itself is reduced, and the
load the receiving terminal 20 incurs for decoding an ES and
constructing a scene is lightened. In the case of the
polygon shown in Fig. 12, the values that express the polygon
must be specified. In contrast, in the case of the
sphere shown in Fig. 14, such values need not be specified.
Therefore, the amount of data of the scene description that
describes the construction of the scene can be reduced.
Moreover, the complex processing of displaying a polygon
that is performed by the receiving terminal 20 is changed to
the simpler processing of displaying a sphere. The
processing load the receiving terminal 20 incurs is thus
lightened.
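
The replacement of a polygon with a sphere can be sketched as follows. The dictionary layout of the nodes, the function names, and the rule that the sphere radius is the largest distance from the origin to a vertex are assumptions introduced for illustration; the text does not state how the sphere is chosen. The function count_numbers is only a rough measure of the amount of data.

```python
import math
from typing import Any, Dict

Node = Dict[str, Any]


def simplify_geometry(node: Node) -> Node:
    """Replace an "IndexedFaceSet" with a "Sphere"; leave other nodes unchanged."""
    if node.get("type") != "IndexedFaceSet":
        return node
    points = node["coord"]["point"]
    # Assumed rule: the sphere encloses every vertex of the polygon.
    radius = max(
        (math.sqrt(x * x + y * y + z * z) for x, y, z in points),
        default=0.0,
    )
    return {"type": "Sphere", "radius": radius}


def count_numbers(value: Any) -> int:
    """Count the numeric values stored anywhere in a node."""
    if isinstance(value, (int, float)):
        return 1
    if isinstance(value, dict):
        return sum(count_numbers(v) for v in value.values())
    if isinstance(value, (list, tuple)):
        return sum(count_numbers(v) for v in value)
    return 0


# Illustrative polygon; these coordinates do not come from Fig. 11.
polygon = {
    "type": "IndexedFaceSet",
    "coord": {
        "type": "Coordinate",
        "point": [
            (1.0, 0.0, 0.0), (-1.0, 0.0, 0.0),
            (0.0, 1.0, 0.0), (0.0, -1.0, 0.0),
            (0.0, 0.0, 1.0), (0.0, 0.0, -1.0),
        ],
    },
    "coordIndex": [0, 2, 4, -1, 2, 1, 4, -1],
}

sphere = simplify_geometry(polygon)
print(sphere, count_numbers(polygon), count_numbers(sphere))
```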
[0072]

The conversion of the scene description shown in Fig.
11 into the scene description shown in Fig. 13, which is
performed during the fourth scene description processing, is
realized with any of the following actions performed by the
scene description processing unit 2.
Specifically, a scene description that meets a criterion
defined based on the state of a transmission line or a
request issued from the receiving terminal 20 is selected
from among a plurality of scene descriptions stored in
advance in the memory 4, and then transmitted. Alternatively, a
scene description is read from the memory 4 and converted
into a scene description that meets the criterion.
Alternatively, a scene description that meets the criterion is
encoded and then transmitted. The criterion referred to here
is a measure of the complexity of a
scene description, such as the amount of data of the scene
description, the number of nodes, or the number of polygons.
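
The selection of a stored scene description that meets the criterion can be sketched as follows. The classes StoredScene and Criterion, the function select_scene, and the policy of preferring the largest scene description that still meets every limit are assumptions introduced for illustration; the text only states that a scene description meeting the criterion is selected.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass(frozen=True)
class StoredScene:
    """Hypothetical record of a scene description held in the memory."""
    name: str
    data_bytes: int
    node_count: int
    polygon_count: int


@dataclass(frozen=True)
class Criterion:
    """Hypothetical complexity limits derived from the line state or a request."""
    max_bytes: int
    max_nodes: int
    max_polygons: int


def select_scene(candidates: Sequence[StoredScene],
                 limit: Criterion) -> Optional[StoredScene]:
    """Pick the largest stored scene description that meets every limit, if any."""
    fitting = [
        s for s in candidates
        if s.data_bytes <= limit.max_bytes
        and s.node_count <= limit.max_nodes
        and s.polygon_count <= limit.max_polygons
    ]
    return max(fitting, key=lambda s: s.data_bytes, default=None)


# Illustrative figures only.
stored = [
    StoredScene("polygon scene", data_bytes=900, node_count=6, polygon_count=20),
    StoredScene("sphere scene", data_bytes=120, node_count=4, polygon_count=0),
]

print(select_scene(stored, Criterion(500, 10, 10)))
print(select_scene(stored, Criterion(1000, 10, 50)))
```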

[0073] 

Moreover, other methods of converting the complexity of
a scene description which may be implemented in the scene
description processing unit 2 will be described below.
Namely, complex part data of a scene description may be
replaced with simpler data like that shown in Fig. 13, or
vice versa. Alternatively, part data of a
scene description may be removed or, conversely, added.
Alternatively, when a scene description is encoded, the
quantization step may be
modified in order to adjust the amount of data of the scene
description. For example, decreasing the number of bits
used for quantization decreases
the amount of data of the scene description. Incidentally,
the MPEG-4 BIFS stipulates that a quantization parameter
indicating whether quantization is adopted and the
number of bits employed can be set for each quantization
category, such as coordinates, the inclination of an axis of
rotation, or a size. Moreover, the quantization parameter
can be changed within one scene description.
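
The effect of the number of quantization bits on the amount of data can be sketched with a generic uniform quantizer. This is not the quantization procedure stipulated in the MPEG-4 BIFS; the functions quantize, dequantize, and coded_bits and the value range are assumptions introduced for illustration.

```python
def quantize(value: float, lo: float, hi: float, n_bits: int) -> int:
    """Map a value in [lo, hi] to an integer code of n_bits bits (n_bits >= 1)."""
    levels = (1 << n_bits) - 1
    clamped = min(max(value, lo), hi)
    return round((clamped - lo) / (hi - lo) * levels)


def dequantize(code: int, lo: float, hi: float, n_bits: int) -> float:
    """Recover an approximate value from an n_bits code."""
    levels = (1 << n_bits) - 1
    return lo + code / levels * (hi - lo)


def coded_bits(num_values: int, n_bits: int) -> int:
    """Amount of data, in bits, needed for num_values quantized values."""
    return num_values * n_bits


# Fewer bits per value: less data, but a coarser reconstruction.
for bits in (16, 8, 4):
    code = quantize(0.3, -1.0, 1.0, bits)
    restored = dequantize(code, -1.0, 1.0, bits)
    print(bits, coded_bits(300, bits), abs(restored - 0.3))
```

Halving the number of bits halves the coded size of the quantized fields, at the cost of a larger reconstruction error.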
[0074] 

As described so far, according to the related art, a
scene description cannot be modified. Therefore, when the
processing load a receiving terminal must incur exceeds the
processing ability of the receiving terminal, part of the
scene data may be lost unexpectedly. Likewise, when
the bandwidth of a transmission line is insufficient,
part of the data to be transmitted may be lost
unexpectedly. According to the fourth scene description
processing employed in the present embodiment, a scene
description is modified so that a scene simplified as
intended by the server 10 can be restored at the receiving
terminal 20. Moreover, according to the fourth scene
description processing, the scene description processing
unit 2 can delete part data of a scene description in
ascending order of importance until the data representing
the scene description fits within the
bandwidth of the transmission line or until the processing
load the receiving terminal 20 must incur conforms to the
processing ability of the receiving terminal 20.
[0075] 

Next, fifth scene description processing will be 
described below. 
[0076] 

During the fifth scene description processing, the
server 10 employed in the present embodiment divides a scene
description into a plurality of decoding units in conformity
with the state of a transmission line or a request issued
from the receiving terminal 20. The bit rate for the scene
description is thereby adjusted, and local concentration of the
processing load the receiving terminal 20 must incur is
avoided. Specifically, the scene description processing
unit 2 in accordance with the present embodiment divides a
scene description into a plurality of decoding units in
conformity with the state of the transmission line or the
request issued from the receiving terminal 20 under the
control of the conversion control unit 1. The scene
description processing unit 2 transmits the scene
description while adjusting the timing of transmitting each
of the decoding units constituting the scene description. A
decoding unit of the scene description that should be
decoded at a certain time instant shall be referred to as an
access unit (hereinafter, AU). Referring to Fig. 15 to Fig.
19, the fifth scene description processing will be
concretely described below.
[0077]

Fig. 15 shows a scene description that includes one AU,
that describes the construction of a scene composed of four
objects, for example, a sphere, a cube, a cone, and a
cylinder, and that is written in compliance with the MPEG-4
BIFS. Fig. 16 shows an example of the scene displayed by
decoding the scene description shown in Fig. 15. Referring
to Fig. 16, the four objects of a sphere 41, a cube 42, a
cone 44, and a cylinder 43 are displayed. The data
representing the scene whose construction is described in
the one AU shown in Fig. 15 must be entirely decoded at a
designated decoding time instant and reflected on display at
a designated display time instant. The decoding time
instant (time instant at which the AU should be decoded and
validated) is termed a decoding time stamp (DTS) in the
MPEG-4.

[0078]

During the fifth scene description processing, the bit
rate for data to be transmitted may have to be lowered due
to the state of a transmission line or a request issued from
the receiving terminal 20. Alternatively, local concentration
of the processing load the receiving terminal 20 must incur
may have to be reduced. In this case, the scene description
processing unit 2 in the server 10 divides a scene
description into a plurality of AUs, and allocates different
DTSs to the AUs. Consequently, the bit rate for each part of
the scene description is converted into a bit rate that conforms
to the state of the transmission line or the request issued
from the receiving terminal 20. Likewise, the throughput
required for decoding part of the scene description at each
DTS is converted into a throughput that conforms to the request
issued from the receiving terminal 20.

[0079] 

Specifically, the scene description processing unit 2
divides, for example, the scene description shown in Fig. 15
into four AUs AU1 to AU4 as shown in Fig. 17. The first AU
AU1 describes that an identification (hereinafter, ID) number
of 1 is assigned to a node "Group" that specifies grouping.
The first AU AU1 can therefore be referenced by subsequent AUs.
According to the MPEG-4 BIFS, a partial scene description can
be added to a grouping node that can be referenced. The
second AU AU2 to the fourth AU AU4 each describe a command that
instructs addition of a partial scene description to the field
"children" subordinate to the node "Group" to which the ID
number of 1 is assigned in the first AU AU1.
[0080] 

The scene description processing unit 2 designates, as
shown in Fig. 18, different DTSs for the first AU AU1 to the
fourth AU AU4. Specifically, a first DTS DTS1 is designated
for the first AU AU1, a second DTS DTS2 is designated for
the second AU AU2, a third DTS DTS3 is designated for the
third AU AU3, and a fourth DTS DTS4 is designated for the
fourth AU AU4. Consequently, the bit rate at which each part
of the scene description is transmitted from the server 10 to the
receiving terminal 20 is lowered. Moreover, the load the
receiving terminal 20 must incur for decoding part data at
each DTS is reduced.
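
The division of one scene description into AUs with different DTSs can be sketched as follows. The class AccessUnit, the function split_into_access_units, the command strings, the use of seconds for the DTS, and the fixed interval are assumptions introduced for illustration; the command strings are not BIFS syntax. As in the example of Fig. 17 to Fig. 19, the first AU defines the grouping node together with the first object, and each later AU adds one object to it.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class AccessUnit:
    """Hypothetical decoding unit of a scene description."""
    dts: float    # decoding time stamp, assumed to be in seconds
    command: str  # textual stand-in for the content carried by the AU


def split_into_access_units(objects: List[str], start: float,
                            interval: float, group_id: int = 1) -> List[AccessUnit]:
    """Put the group and first object in the first AU, then one object per later AU."""
    if not objects:
        return []
    units = [AccessUnit(start, f"define Group id={group_id} children=[{objects[0]}]")]
    for k, obj in enumerate(objects[1:], start=1):
        units.append(
            AccessUnit(start + k * interval,
                       f"append {obj} to children of node {group_id}")
        )
    return units


units = split_into_access_units(["sphere", "cube", "cone", "cylinder"],
                                start=0.0, interval=0.5)
for unit in units:
    print(unit.dts, unit.command)
```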

[0081] 

A scene to be displayed by decoding the four AUs, into
which the scene description is divided as shown in Fig. 17,
at the DTSs DTS1 to DTS4 has, as shown in Fig. 19, an object
added thereto at each DTS. At the last DTS DTS4, the same
scene as that shown in Fig. 16 is completed. Specifically,
the sphere 41 is displayed at the first DTS DTS1, the cube
42 is added at the second DTS DTS2, the cone 44 is added at
the third DTS DTS3, and the cylinder 43 is added at the
fourth DTS DTS4. Eventually, the four objects are displayed.
[0082]

The conversion of the scene description shown in Fig.
15 into the scene description shown in Fig. 17, which is
performed during the fifth scene description processing, is
realized by any of the following actions performed by the
scene description processing unit 2. Namely, a
scene description that meets a criterion dependent on the
state of a transmission line or a request issued from the
receiving terminal 20 is selected from among a plurality of
scene descriptions stored in advance in the memory 4, and
then transmitted. Alternatively, a scene description is read
from the memory 4 and converted into a scene description
that is divided into portions (AUs AU1 to AU4) until each
portion meets the criterion. Alternatively, the scene
description that is divided into portions (AUs AU1 to AU4)
until each portion meets the criterion is encoded portion
by portion and then transferred. The criterion employed
in the fifth scene description processing may be the amount
of data of one AU, the number of nodes contained in one AU,
the number of objects described in one AU, the number of
polygons described in one AU, or any other criterion
expressing a limit relevant to one AU of a scene description.

[0083] 

As described so far, according to the fifth scene 
description processing, a scene description is divided into 
a plurality of AUs, and a time interval between DTSs 
allocated to AUs is adjusted. Thus, an average bit rate for 
a scene description is controlled. Also, the local load of 
the decoding process on the receiving terminal 20 can be 
reduced. Incidentally, the average bit rate is calculated 
by dividing the sum of amounts of data of AUs to which DTSs 
within a certain period of time are allocated, by the period 
of time. The scene description processing unit 2 adjusts 
the time interval between DTSs so as to realize an average 
bit rate that conforms to the state of a transmission line 
or a request issued from the receiving terminal 20. In the 
aforesaid example, a scene description is divided into AUs. 
Conversely, a plurality of AUs may be integrated into 
one unit. 
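
The average bit rate calculation and the adjustment of the DTS interval can be illustrated with a short sketch. The function names `average_bit_rate` and `schedule_dts`, the units (seconds and bits), and the sample AU sizes are assumptions introduced for the example; the specification does not prescribe a particular scheduling rule.

```python
from typing import List, Tuple


def average_bit_rate(aus: List[Tuple[float, int]], start: float, end: float) -> float:
    """Sum the sizes (bits) of AUs whose DTS lies in [start, end), divided by the period.

    Each AU is given as a (dts_seconds, size_bits) pair.
    """
    period = end - start
    if period <= 0:
        raise ValueError("end must be later than start")
    total_bits = sum(size for dts, size in aus if start <= dts < end)
    return total_bits / period


def schedule_dts(au_sizes_bits: List[int], target_bps: float, start: float = 0.0) -> List[float]:
    """Assign DTSs so that each AU is followed by an interval of size / target_bps.

    This is one simple (assumed) way to make the average bit rate match a
    target derived from the transmission line state or a receiver request.
    """
    if target_bps <= 0:
        raise ValueError("target_bps must be positive")
    dts_list: List[float] = []
    t = start
    for size in au_sizes_bits:
        dts_list.append(t)
        t += size / target_bps  # a lower target rate widens the interval
    return dts_list


sizes = [8000, 8000, 8000, 8000]       # hypothetical sizes of AU1 to AU4, in bits
dts = schedule_dts(sizes, target_bps=4000.0)
rate = average_bit_rate(list(zip(dts, sizes)), start=0.0, end=8.0)
print(dts, rate)
```

Halving the target rate doubles the interval between DTSs, which spreads the same four AUs over twice the time and also spaces out the decoding work at the receiving terminal.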

[0084] 

The above description assumes that the first scene 
description processing to the fifth scene description 
processing are performed independently of one another. Some 
kinds of the scene description processing may instead be 
combined and performed concurrently. In this case, the 
aforesaid operations and advantages of the combined kinds of 
scene description processing are obtained simultaneously. 

[0085] 

Moreover, according to the present embodiment, a scene 
description written in compliance with the MPEG-4 BIFS is 
taken as an example. The present invention is not limited to 
the MPEG-4 BIFS but can be applied to any scene description 
form. For example, a scene description form enabling 
description of a portion of a scene description that must be 
modified may be adopted. In this case, the present 
invention can be applied to transmission of the modified 
portion alone. 
[0086] 

Furthermore, the present embodiment may be implemented 
in hardware or software. 
[0087] 
[Advantages] 

According to the present invention, a scene description 
that conforms to the state of a transmission line and/or a 
request issued from a receiving side is produced. Thus, 
scene description data that conforms to the state of the 
transmission line or the processing ability of the receiving 
side can be transmitted to the receiving side. Consequently, 
occurrence of a drawback such as an unexpected loss of part 
of a scene, unintended by the transmitting side, can be 
avoided. The unexpected loss results from a loss 
occurring on the transmission line or the insufficient 
processing ability of the receiving side. Even when a 
transmission rate at which a signal is transmitted to the 
receiving side is changed, the receiving side can decode 
data according to a scene construction that conforms to the 
transmission rate. Furthermore, a change in information 
needed to decode data can be explicitly reported to the 
receiving side. The receiving side is therefore relieved of 
the necessity of extracting the information necessary for 
decoding from the data represented by the signal. 
[Brief Description of the Drawings] 
[Fig. 1] 

Fig. 1 is a block diagram showing the outline 
configuration of a data distribution system in accordance 
with an embodiment of the present invention. 
[Fig. 2] 

Fig. 2 shows a result of scene display performed based 
on a scene description that has not been modified during 
first scene description processing employed in the present 
embodiment. 
[Fig. 3] 

Fig. 3 shows an example of a scene description (written 
in compliance with the MPEG-4 BIFS) describing the 
construction of a scene shown in Fig. 2. 
[Fig. 4] 

Fig. 4 shows a result of scene display performed based 
on a scene description that has been modified during the 
first scene description processing employed in the present 
embodiment. 
[Fig. 5] 

Fig. 5 is an explanatory diagram used to explain the 
timing of modifying ESs and the timing of modifying a scene 
description during the first scene description processing 
employed in the present embodiment. 
[Fig. 6] 

Fig. 6 shows an example of a scene description (written 
in compliance with the MPEG-4 BIFS) describing the 
construction of the scene shown in Fig. 4. 
[Fig. 7] 

Fig. 7 shows an example of information (described in 
ObjectDescriptor stipulated in the MPEG-4) that is appended 
to the scene description shown in Fig. 3 and that is needed 
to decode ESs that are used to construct the scene shown in 
Fig. 2. 
[Fig. 8] 

Fig. 8 shows an example of information (described in 
ObjectDescriptor stipulated in the MPEG-4) that is appended 
to the scene description shown in Fig. 6 and that is needed 
to decode ESs which are used to construct the scene shown in 
Fig. 4. 
[Fig. 9] 

Fig. 9 shows an example of a scene description (written 
in compliance with the MPEG-4 BIFS) describing the 
construction of a scene that differs from the scene described 
in conjunction with Fig. 2 and Fig. 3 in that a motion 
picture ES is not used. 
[Fig. 10] 

Fig. 10 shows a result of display performed based on 
the scene description shown in Fig. 9. 
[Fig. 11] 

Fig. 11 shows an example of a scene description 
(written in compliance with the MPEG-4 BIFS) according to 
which an object described as a polygon is displayed. 
[Fig. 12] 

Fig. 12 shows an example of a scene description 
(written in compliance with the MPEG-4 BIFS) according to 
which a sphere is substituted for an object described as a 
polygon. 
[Fig. 13] 

Fig. 13 shows a result of display performed based on 
the scene description shown in Fig. 11. 
[Fig. 14] 

Fig. 14 shows a result of display performed based on 
the scene description shown in Fig. 12. 
[Fig. 15] 

Fig. 15 shows an example of a scene description 
(written in compliance with the MPEG-4 BIFS) describing the 
construction of a scene composed of four objects. 
[Fig. 16] 

Fig. 16 shows a result of display performed according 
to the scene description shown in Fig. 15. 
[Fig. 17] 

Fig. 17 shows an example of four AUs (written in 
compliance with the MPEG-4 BIFS) into which the scene 
description shown in Fig. 15 is divided. 
[Fig. 18] 

Fig. 18 is an explanatory diagram used to explain the 
timing of decoding each AU shown in Fig. 17. 
[Fig. 19] 

Fig. 19 shows a result of display performed based on 
the scene description composed of the AUs shown in Fig. 17. 
[Fig. 20] 

Fig. 20 is a block diagram showing the outline 
configuration of a conventional data distribution system. 
[Reference Numerals] 

1: conversion control unit 
2: scene description processing unit 
3: ES processing unit 
4: memory 

5: transmission control unit 

6: data-transmitted state detector 

7: transmission medium 

10: server 
20: receiving terminal 

21: reception control unit 

22: transmission request processing unit 

23: scene description decoding unit 

24: ES decoding unit 

25: display sounding unit 
[Name of Document] ABSTRACT 
[Abstract] 

[Object] A scene description that conforms to the state 
of a transmission line or the processing ability of a 
receiving side can be transmitted. Consequently, occurrence 
of a drawback such as an unexpected loss of part of a scene, 
unintended by the transmitting side, is avoided. Even 
when a transmission rate is changed, decoding can be 
performed according to a scene construction that conforms to 
the resultant transmission rate. A change in information 
needed to decode signals is explicitly reported to the 
receiving side. This relieves the receiving side of the 
necessity of extracting the information, which is needed to 
decode signals, from the signals. 

[Solving Means] A data transmission system consists 
mainly of a server 10 that transmits a scene description 
which describes the structure of multimedia data in a scene 
and a receiving terminal 20 that constructs the scene 
according to the scene description. The server 10 includes 
a scene description processing unit 2 that transfers a scene 
description which conforms to the state of a transmission 
line and/or a request issued from the receiving terminal 20. 
[Selected Figure] Fig. 1 